Kongens Lyngby
Gradients of Functions of Large Matrices
Tuning scientific and probabilistic machine learning models - for example, partial differential equations, Gaussian processes, or Bayesian neural networks - often relies on evaluating functions of matrices whose size grows with the data set or the number of parameters. While the state of the art for evaluating these quantities is almost always based on Lanczos and Arnoldi iterations, the present work is the first to explain how to differentiate these workhorses of numerical linear algebra efficiently. To get there, we derive previously unknown adjoint systems for Lanczos and Arnoldi iterations, implement them in JAX, and show that the resulting code competes with Diffrax for differentiating PDEs and with GPyTorch for selecting Gaussian process models, and beats standard factorisation methods for calibrating Bayesian neural networks. All this is achieved without any problem-specific code optimisation.
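As a rough illustration of the forward pass being differentiated here, the sketch below runs a k-step Lanczos iteration in JAX and lets `jax.grad` differentiate through the unrolled loop - the naive baseline that the paper's adjoint systems are designed to improve on. The quadratic-form function `quad_logdetish`, the parametric family `A(theta) = I + theta * B @ B.T`, and all sizes are hypothetical choices made for this example, not the paper's setup.

```python
import jax
import jax.numpy as jnp


def lanczos(A, v, k):
    """k-step Lanczos tridiagonalisation of a symmetric matrix A, started from v."""
    v0 = v / jnp.linalg.norm(v)
    V = jnp.zeros((k, v.shape[0])).at[0].set(v0)
    alphas, betas = jnp.zeros(k), jnp.zeros(k - 1)
    w = A @ v0
    alphas = alphas.at[0].set(v0 @ w)
    w = w - alphas[0] * v0
    for j in range(1, k):  # unrolled by JAX; reverse-mode stores every iterate
        beta = jnp.linalg.norm(w)
        betas = betas.at[j - 1].set(beta)
        vj = w / beta
        V = V.at[j].set(vj)
        w = A @ vj
        alphas = alphas.at[j].set(vj @ w)
        w = w - alphas[j] * vj - beta * V[j - 1]
    T = jnp.diag(alphas) + jnp.diag(betas, 1) + jnp.diag(betas, -1)
    return V, T


def quad_logdetish(theta, v, B, k=10):
    """Estimate v^T log(A(theta)) v for the toy SPD family A(theta) = I + theta * B B^T."""
    A = jnp.eye(B.shape[0]) + theta * (B @ B.T)
    _, T = lanczos(A, v, k)
    evals, evecs = jnp.linalg.eigh(T)          # f(T) via eigendecomposition of T
    fT = (evecs * jnp.log(evals)) @ evecs.T
    return jnp.linalg.norm(v) ** 2 * fT[0, 0]  # ||v||^2 * e1^T f(T) e1


n = 50
B = jax.random.normal(jax.random.PRNGKey(0), (n, n)) / jnp.sqrt(n)
v = jnp.ones(n)
print(jax.grad(quad_logdetish)(0.5, v, B))     # reverse-mode through the iteration
```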
Approximate Inference Turns Deep Networks into Gaussian Processes
Mohammad Emtiyaz Khan, Alexander Immer, Ehsan Abedi, Maciej Korzepa
We present theoretical results aimed at connecting the training methods of deep learning and GP models. We show that the Gaussian posterior approximations for Bayesian DNNs, such as those obtained by Laplace approximation and variational inference (VI), are equivalent to posterior distributions of GP regression models.
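As a toy illustration of this equivalence (not the authors' code), the sketch below linearises a small network around a weight vector and reads off the GP mean and kernel induced by an isotropic Gaussian weight posterior; the network `mlp`, the stand-in MAP estimate `w_star`, and the posterior covariance `S` are all assumptions made for the example.

```python
import jax
import jax.numpy as jnp


def mlp(w, x):
    """Tiny scalar-output MLP with a flat parameter vector w (sizes chosen for the demo)."""
    W1, b1 = w[:20].reshape(10, 2), w[20:30]
    W2, b2 = w[30:40], w[40]
    return W2 @ jnp.tanh(W1 @ x + b1) + b2


w_star = 0.1 * jax.random.normal(jax.random.PRNGKey(0), (41,))  # stand-in for a MAP estimate
xs = jnp.stack([jnp.linspace(-2, 2, 5), jnp.ones(5)], axis=1)   # five 2-d inputs

# Jacobian of the network output w.r.t. the weights, one row per input.
J = jax.vmap(lambda x: jax.grad(mlp)(w_star, x))(xs)            # shape (5, 41)

# A Gaussian weight posterior N(w_star, S) induces a GP on the linearised network
# f(x, w) ~ f(x, w_star) + J(x) (w - w_star): mean f(x, w_star), kernel J(x) S J(x')^T.
S = 0.1 * jnp.eye(41)                          # toy isotropic posterior covariance
K = J @ S @ J.T                                # induced GP covariance over the inputs
mean = jax.vmap(lambda x: mlp(w_star, x))(xs)  # induced GP mean
print(mean.shape, K.shape)
```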
Fast Sampling for Flows and Diffusions with Lazy and Point Mass Stochastic Interpolants
Gabriel Damsholt, Jes Frellsen, Susanne Ditlevsen
Stochastic interpolants unify flows and diffusions, two popular generative modeling frameworks. A primary hyperparameter in these methods is the interpolation schedule, which determines how to bridge a standard Gaussian base measure to an arbitrary target measure. We show how to convert a sample path of a stochastic differential equation (SDE) with arbitrary diffusion coefficient under any schedule into the unique sample path under another arbitrary schedule and diffusion coefficient. We then extend the stochastic interpolant framework to admit a larger class of point mass schedules, in which the Gaussian base measure collapses to a point mass measure. Under the assumption of Gaussian data, we identify lazy schedule families that make the drift identically zero, and we show that deterministic sampling recovers a variance-preserving schedule commonly used in diffusion models, whereas statistically optimal SDE sampling recovers our point mass schedule. Finally, to demonstrate the usefulness of our theoretical results on realistic, highly non-Gaussian data, we apply our lazy schedule conversion to a state-of-the-art pretrained flow model and show that it allows images to be generated in fewer steps without retraining the model.
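The Gaussian-data setting can be made concrete in a few lines. The sketch below - a toy 1-d example with a hypothetical linear schedule, not the paper's construction - builds the interpolant marginals N(m_t, v_t) implied by a schedule and integrates the closed-form probability-flow ODE from the base measure to the target, illustrating how the schedule alone determines the transport when the data are Gaussian.

```python
import jax
import jax.numpy as jnp

mu, s2 = 2.0, 0.25                            # target N(mu, s2)

def a(t): return 1.0 - t                      # toy schedule for the base noise
def b(t): return t                            # toy schedule for the data

def m(t): return b(t) * mu                    # marginal mean of x_t = a(t) z + b(t) x1
def v(t): return a(t) ** 2 + b(t) ** 2 * s2   # marginal variance

def drift(t, x):
    """Probability-flow ODE drift for Gaussian marginals: m' + (v' / 2v) (x - m)."""
    dm, dv = jax.grad(m)(t), jax.grad(v)(t)
    return dm + dv / (2.0 * v(t)) * (x - m(t))

# Euler-integrate samples from the base N(0, 1) at t=0 to the target at t=1.
x = jax.random.normal(jax.random.PRNGKey(0), (10000,))
ts = jnp.linspace(0.0, 1.0, 200)
dt = ts[1] - ts[0]
for t in ts[:-1]:
    x = x + dt * drift(t, x)

print(x.mean(), x.var())                      # should approach mu = 2.0 and s2 = 0.25
```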
Lifting Biomolecular Data Acquisition
Eli N. Weinstein, Andrei Slabodkin, Mattia G. Gollub, Kerry Dobbs, Xiao-Bing Cui, Fang Zhang, Kristina Gurung, Elizabeth B. Wood
One strategy for scaling up ML-driven science is to increase the information density of wet-lab experiments. We present a method based on a neural extension of compressed sensing to function space: we measure the activity of many different molecules simultaneously, rather than individually, and then deconvolute the molecule-activity map during model training. Co-designing wet-lab experiments and learning algorithms in this way provably leads to orders-of-magnitude gains in information density. We demonstrate the approach on antibodies and cell therapies.
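The classic linear core of the pooling idea fits in a short sketch. Below, a hypothetical assay measures m random pools of n molecules, and the per-molecule activities are deconvoluted by lasso-style regression (ISTA); the neural, function-space extension and the co-design with training that the paper describes are not shown, and all sizes are made up for the demo.

```python
import numpy as np
import jax.numpy as jnp

rng = np.random.default_rng(0)
n, m, k = 200, 60, 8                       # molecules, pooled measurements, actives

f_true = np.zeros(n)
f_true[rng.choice(n, k, replace=False)] = 1.0                  # sparse true activities
A = jnp.asarray(rng.random((m, n)) < 0.5, dtype=jnp.float32)   # random 0/1 pooling design
y = A @ jnp.asarray(f_true) + 0.01 * jnp.asarray(rng.normal(size=m))  # pooled readouts


def soft(x, t):
    """Soft-thresholding, the proximal map of the L1 penalty."""
    return jnp.sign(x) * jnp.maximum(jnp.abs(x) - t, 0.0)


lam = 0.1
L = 2.0 * jnp.linalg.norm(A, ord=2) ** 2   # Lipschitz constant of the quadratic's gradient
f = jnp.zeros(n)
for _ in range(5000):                      # ISTA: gradient step, then shrinkage
    grad = 2.0 * A.T @ (A @ f - y)
    f = soft(f - grad / L, lam / L)

print(float(jnp.max(jnp.abs(f - jnp.asarray(f_true)))))  # should be small on this toy problem
```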
Learning Conditional Independence Differential Graphs From Time-Dependent Data
Estimation of differences in the conditional independence graphs (CIGs) of two time series Gaussian graphical models (TSGGMs) is investigated, where the two TSGGMs are known to have similar structure. The TSGGM structure is encoded in the inverse power spectral density (IPSD) of the time series. Several existing works estimate the difference between two precision matrices to characterize underlying changes in the conditional dependencies of two data sets consisting of independent and identically distributed (i.i.d.) observations. In this paper we consider estimation of the difference between two IPSDs to characterize the underlying changes in the conditional dependencies of two sets of time-dependent data; unlike past work, our approach accounts for the time dependencies in the data. We analyze a penalized D-trace loss function approach in the frequency domain for differential graph learning, using Wirtinger calculus, and consider both convex (group lasso) and non-convex (log-sum and SCAD group penalties) penalty/regularization functions. An alternating direction method of multipliers (ADMM) algorithm is presented to optimize the objective function. We establish sufficient conditions in a high-dimensional setting for consistency (convergence of the estimated IPSD difference to its true value in the Frobenius norm) and graph recovery. Both synthetic and real data examples are presented in support of the proposed approaches. In the synthetic data examples, our log-sum-penalized differential time-series graph estimator significantly outperformed our lasso-based differential time-series graph estimator which, in turn, significantly outperformed an existing lasso-penalized i.i.d. modeling approach, with the $F_1$ score as the performance metric.
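For the i.i.d. special case the paper builds on, the D-trace idea fits in a short real-valued sketch. The code below is a simplification - plain lasso solved with proximal gradient (ISTA) rather than the paper's frequency-domain ADMM with Wirtinger calculus and group penalties - that estimates the precision-difference matrix directly from two sample covariances; all sizes and the planted differential edge are made up for the demo.

```python
import numpy as np
import jax.numpy as jnp

rng = np.random.default_rng(0)
p, n = 20, 2000

# Two sparse precision matrices that differ in a single edge.
Omega1 = np.eye(p) + 0.4 * (np.diag(np.ones(p - 1), 1) + np.diag(np.ones(p - 1), -1))
Omega2 = Omega1.copy()
Omega2[0, 5] = Omega2[5, 0] = 0.3                    # the planted differential edge
X1 = rng.multivariate_normal(np.zeros(p), np.linalg.inv(Omega1), n)
X2 = rng.multivariate_normal(np.zeros(p), np.linalg.inv(Omega2), n)
S1, S2 = jnp.asarray(np.cov(X1.T)), jnp.asarray(np.cov(X2.T))


def soft(x, t):
    """Soft-thresholding, the proximal map of the L1 penalty."""
    return jnp.sign(x) * jnp.maximum(jnp.abs(x) - t, 0.0)


# D-trace loss (1/2) tr(S1 D S2 D) - tr(D (S1 - S2)) + lam ||D||_1; its
# unpenalised population minimiser is Omega2 - Omega1, the precision difference.
lam = 0.05
lr = 1.0 / (jnp.linalg.norm(S1, ord=2) * jnp.linalg.norm(S2, ord=2))
D = jnp.zeros((p, p))
for _ in range(500):                                 # ISTA: gradient step, then shrinkage
    grad = 0.5 * (S1 @ D @ S2 + S2 @ D @ S1) - (S1 - S2)
    D = soft(D - lr * grad, lr * lam)

print(jnp.unravel_index(jnp.argmax(jnp.abs(D)), D.shape))  # expect the edge (0, 5) or (5, 0)
```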